Skip to content

bugfix(network): Reset network command id to keep commands in chronological order#2736

Open
Caball009 wants to merge 6 commits into
TheSuperHackers:mainfrom
Caball009:fix_network_commandID_overflow
Open

bugfix(network): Reset network command id to keep commands in chronological order#2736
Caball009 wants to merge 6 commits into
TheSuperHackers:mainfrom
Caball009:fix_network_commandID_overflow

Conversation

@Caball009
Copy link
Copy Markdown

@Caball009 Caball009 commented May 19, 2026

Synchronized network commands are given a unique id so that they can be sorted chronologically. The id needs to be unique per execution frame. It's a uint16 value so it'll overflow quite fast, especially because it's never reset and the starting value is 64000 by default.

When the network runahead decreases it's expected that multiple frames are executed right after each other. I noticed an issue with the overflow when I set the CRC interval to once per frame. When the overflow happened for a client combined with a decrease of the runahead, the order of the CRC messages was no longer guaranteed to be in chronological order for that execution frame. That would cause a mismatch if the overflow didn't happen for all clients at the same time.

With a starting value of 64000 and a CRC message every frame the first overflow happens after ~25 seconds. You can see it mismatch here because of the overflow (this was with a special test build);
https://www.youtube.com/live/V_l-q9Y-DmA?t=621s
https://www.youtube.com/live/V_l-q9Y-DmA?t=9499s

I've added some test code in the second commit. If that's used without this fix, it becomes very apparent that the messages are not in the right order when the command id overflows. ~On top of that, it's very likely that the disconnect bug (disconnect screen with everyone at 59XXX msec) occurs frequently.

I intend to do a follow up PR to make the static commandID variable a part of the NetCommandMsg class.

@Caball009 Caball009 added Bug Something is not working right, typically is user facing Major Severity: Minor < Major < Critical < Blocker Network Anything related to network, servers Gen Relates to Generals ZH Relates to Zero Hour labels May 19, 2026
@Caball009 Caball009 force-pushed the fix_network_commandID_overflow branch 2 times, most recently from ee6bcd8 to 1f31061 Compare May 31, 2026 18:32
@Caball009 Caball009 marked this pull request as ready for review May 31, 2026 18:46
@Caball009 Caball009 requested a review from Mauller May 31, 2026 18:46
@greptile-apps
Copy link
Copy Markdown

greptile-apps Bot commented May 31, 2026

Greptile Summary

This PR fixes a command-ID overflow bug in the network synchronization layer. The static commandID in GenerateNextCommandID() was a function-local uint16 starting at 64000 that was never reset, causing it to wrap around to 0 every ~1535 commands — and when the overflow didn't coincide across all clients it produced out-of-order CRC/FRAMEINFO messages and disconnect bugs.

  • NetworkUtil.cpp: Lifts commandID to a file-scope static (s_commandID) initialized to 0, adds GetCommandID() and ResetCommandID() helpers, and changes from pre-increment to post-increment semantics.
  • Network.cpp: Calls ResetCommandID() in init() for a clean start, and in update() resets the counter to 0 whenever it reaches 64000 and the network is transitioning to a new execution frame.
  • NetCommandList.cpp: Adds a comment documenting why overflow handling in the sort is intentionally deferred.

Confidence Score: 5/5

Safe to merge; the change is a targeted fix to a well-understood overflow bug with no regressions introduced.

The reset is placed correctly before GetCommandsFromCommandList() in every update cycle, the threshold of 64000 stays well below the uint16 maximum, and the init()-level reset ensures a clean start each game. The guard condition mirrors the exact predicate used by getExecutionFrame(), so the reset fires once per new execution frame on all deterministic clients simultaneously.

No files require special attention.

Important Files Changed

Filename Overview
Core/GameEngine/Source/GameNetwork/NetworkUtil.cpp Refactors the command-ID counter to file scope and adds GetCommandID()/ResetCommandID(); semantics change from pre- to post-increment is intentional and correct.
Core/GameEngine/Source/GameNetwork/Network.cpp Adds reset in init() and a threshold-based reset in update() before command collection; guard condition correctly ensures reset only fires when entering a new execution frame.
Core/GameEngine/Include/GameNetwork/networkutil.h Two new function declarations added cleanly; header already uses #pragma once.
Core/GameEngine/Source/GameNetwork/NetCommandList.cpp Documentation-only change: adds a rationale comment explaining the known overflow limitation in addMessage() sorting.

Sequence Diagram

sequenceDiagram
    participant GL as GameLogic
    participant NET as Network::update()
    participant NU as NetworkUtil
    participant CCL as GetCommandsFromCommandList()
    participant NCL as NetCommandList

    GL->>NET: frame advances
    NET->>NU: GetCommandID()
    alt "commandID >= 64000 AND new execution frame"
        NET->>NU: ResetCommandID()
    end
    NET->>CCL: collect game messages
    CCL->>NU: GenerateNextCommandID()
    CCL->>NCL: addMessage(cmd)
    NCL-->>NCL: sort by commandID
    NCL-->>NET: ordered command list
Loading

Reviews (6): Last reviewed commit: "Changed implementation." | Re-trigger Greptile

@Caball009 Caball009 force-pushed the fix_network_commandID_overflow branch from 1f31061 to ebf1fa2 Compare May 31, 2026 18:59
Comment thread Core/GameEngine/Source/GameNetwork/NetworkUtil.cpp Outdated
@Caball009 Caball009 force-pushed the fix_network_commandID_overflow branch from ebf1fa2 to a7ff356 Compare June 2, 2026 11:45
@helmutbuhler
Copy link
Copy Markdown

Resetting the command id just delays the potential mismatch, right? Wouldn't it be better to sort the command ids properly, even if that rarely causes mismatches with retail clients?

@Caball009
Copy link
Copy Markdown
Author

Resetting the command id just delays the potential mismatch, right?

It's reset every N frames, so it's not just delayed.

@Caball009 Caball009 force-pushed the fix_network_commandID_overflow branch from a7ff356 to 3ae2622 Compare June 3, 2026 01:29
Copy link
Copy Markdown

@xezon xezon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The id needs to be unique per execution frame.

Can we just reset it to 0 or 1 at the beginning of every new frame? Or does that cause issues? What is the reason for it to reset every 128 frames now?

Comment thread Core/GameEngine/Source/GameNetwork/NetworkUtil.cpp Outdated
static UnsignedShort commandID = 64000;
++commandID;
return commandID;
static UnsignedShort commandID = 0;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it ok to start it at 0 when retail clients still start at 64000 ? I suspect this means every client can make up his own Id numbers as he pleases ?

Any clue why the original author picked 64000 as a start for it ?

Copy link
Copy Markdown
Author

@Caball009 Caball009 Jun 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it ok to start it at 0 when retail clients still start at 64000 ?

Yes, this is ok. It's an important argument in favor of resetting the command ID as opposed to fixing the sorting implementation. The latter is not retail-compatible and relatively complicated.

I suspect this means every client can make up his own Id numbers as he pleases ?

Yes, each client can use their own value.

Any clue why the original author picked 64000 as a start for it ?

Not sure. Perhaps they were testing overflow behavior and forgot to change it back. If that's correct, I suppose the question becomes why didn't they notice the issues that come with overflows. They didn't have multiple instances so they'd be more constrained in their ability to stress test it, but I'm only speculating.

Comment thread Core/GameEngine/Source/GameNetwork/Network.cpp

const UnsignedInt frame = TheGameLogic->getFrame();
if (m_localStatus == NETLOCALSTATUS_INGAME) {
if (frame + m_runAhead > m_lastExecutionFrame) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not understand this condition. Can you explain it?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's extracted from Network::getExecutionFrame. It should prevent the command id from getting reset in the middle of an execution frame.

@Caball009 Caball009 force-pushed the fix_network_commandID_overflow branch from 3ae2622 to 93c5ce7 Compare June 4, 2026 17:07
@helmutbuhler
Copy link
Copy Markdown

When it's reset every 128 frames, isn't it possible that a packet from frame 127 comes in after it's already reset and then gets sorted wrong?

@xezon
Copy link
Copy Markdown

xezon commented Jun 5, 2026

My current understanding is that the ids only matter per frame, not across different frames. But then I do not understand why it needs to be reset every 128 frames instead of every frame.

@Caball009 Caball009 force-pushed the fix_network_commandID_overflow branch from 93c5ce7 to 1bccb3a Compare June 7, 2026 18:54
@Caball009
Copy link
Copy Markdown
Author

I noticed that at extremely high latency (beyond 64 frames runahead) the disconnect screen could be triggered by the reset somehow. I tried a ton of things, but I was unable to find out why that happens. That's why I made some changes to the implementation and only reset the command id when it gets close to overflowing.

When it's reset every 128 frames, isn't it possible that a packet from frame 127 comes in after it's already reset and then gets sorted wrong?

if (frame + m_runAhead > m_lastExecutionFrame)

This check should guard against that. If the check is inverted, you'll see wrong sorting once in a while.

My current understanding is that the ids only matter per frame, not across different frames. But then I do not understand why it needs to be reset every 128 frames instead of every frame.

If the runahead decreases the ids matter across multiple frames.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Bug Something is not working right, typically is user facing Gen Relates to Generals Major Severity: Minor < Major < Critical < Blocker Network Anything related to network, servers ZH Relates to Zero Hour

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants